The subject matter disclosed herein relates to cognitive data center management and, more particularly, to a system, apparatus, and method for data center management through predictive failure analysis and machine learning.
Data centers process and store large amounts of data on an ongoing basis. Data center infrastructure typically includes multiple rows of IT equipment racks for server nodes, storage enclosures, and other IT equipment. Data centers include power systems that provide power to the IT equipment racks. Cooling systems within the data centers provide cooling flows of air that pass through the spaces separating the rows of IT equipment racks. Managing a data center sometimes requires replacing devices in the IT equipment when those devices fail. Data center management also includes management of power and cooling systems, as well as management of IT workload and performance. Some data center management tools detect the imminent failure of a device only at a point in time when repairing or replacing the device interrupts data center operations.